Distributed Systems for ML
Lecturer: Eric P. Xing

Authors

  • Eric P. Xing
  • Petar Stojanov (scribe)
  • Christoph Dann (scribe)
Abstract

Coordinate descent is a general strategy for convex optimization problems. The basic idea is to solve the problem iteratively by optimizing the objective with respect to a single variable at a time while keeping all other dimensions fixed. While the order in which the dimensions are optimized can be chosen arbitrarily, it is crucial for the convergence guarantees that the updates occur sequentially.
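
To make the update scheme concrete, here is a minimal sketch of cyclic coordinate descent on a smooth convex objective; the quadratic example, the step size, and the function names are illustrative assumptions, not part of the lecture notes.

    import numpy as np

    def coordinate_descent(grad, x0, step=0.1, n_sweeps=100):
        # Cyclic coordinate descent: sweep over the coordinates and update
        # one of them at a time, holding all the others fixed.
        x = np.array(x0, dtype=float)
        for _ in range(n_sweeps):
            for j in range(len(x)):      # updates are applied sequentially
                g = grad(x)              # current gradient; only entry j is used
                x[j] -= step * g[j]      # move along coordinate j only
        return x

    # Toy example (assumed): minimize f(x) = 0.5 x'Ax - b'x, whose minimizer is A^{-1} b
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    print(coordinate_descent(lambda x: A @ x - b, x0=[0.0, 0.0]))  # ~ [0.2, 0.4]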

Similar resources

13 : Theory of Variational Inference

We then went on to introduce mean field variational inference, in which the variational distribution over latent variables is assumed to factor as a product of functions over each latent variable. In this sense, the joint approximation is made using the simplifying assumption that all latent variables are independent. Alternatively, we can use approximations in which only groups of variables ar...
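
A one-line statement of the mean-field assumption described above may help (notation assumed, not taken from the excerpt): the variational distribution over the latent variables z_1, ..., z_m is restricted to the factorized family

    q(z_1, \dots, z_m) = \prod_{i=1}^{m} q_i(z_i),

so every latent variable is treated as independent of all the others under the approximation.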

Consistent Bounded-Asynchronous Parameter Servers for Distributed ML

In distributed ML applications, shared parameters are usually replicated among computing nodes to reduce network overhead. Therefore, a proper consistency model must be chosen carefully to ensure the algorithm's correctness while providing high throughput. Existing consistency models used in general-purpose databases and modern distributed ML systems are either too loose to guarantee correctness of the ...
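
As a rough illustration of the kind of bounded-asynchronous consistency model discussed above, the sketch below implements a simple stale-synchronous gate: a worker may keep computing only while the slowest worker is at most a fixed number of iterations behind. The class name, threshold, and methods are assumptions for illustration, not the paper's API.

    class StalenessGate:
        def __init__(self, n_workers, staleness=3):
            self.clocks = [0] * n_workers    # last completed iteration of each worker
            self.staleness = staleness       # maximum allowed lag between workers

        def tick(self, worker_id):
            self.clocks[worker_id] += 1      # worker finished one more iteration

        def may_proceed(self, worker_id):
            # Proceed only if this worker is within `staleness` iterations
            # of the slowest worker; otherwise it must wait.
            return self.clocks[worker_id] - min(self.clocks) <= self.staleness

    gate = StalenessGate(n_workers=4)
    gate.tick(0)
    print(gate.may_proceed(0))  # True: worker 0 is only 1 iteration ahead of the slowest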

Spectral Algorithms for Graphical Models. Lecturer: Eric

Modern machine learning tasks often deal with high-dimensional data. One typically makes some assumption on structure, like sparsity, to make learning tractable over high-dimensional instances. Another common assumption on structure is that of latent variables in the generative model. In latent variable models, one attempts to perform inference not only on observed variables, but also on unobserved...

17 : Markov Chain Monte Carlo

...which decreases as J gets larger, so the approximation becomes more accurate as we obtain more samples. Here is an example of using Monte Carlo methods to integrate out the weights in a Bayesian neural network. Let y(x) = f(x, w) for response y and input x, and let p(w) be the prior over the weights w. The posterior distribution of w given the data D is p(w|D) ∝ p(D|w)p(w), where p(D|w) is the likelihood...
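
A minimal sketch of the Monte Carlo approximation described above, assuming posterior weight samples w_1, ..., w_J are already available; the toy network f and the stand-in sampler are hypothetical, not from the notes.

    import numpy as np

    def predictive_mean(f, x, weight_samples):
        # Monte Carlo estimate of E[y(x)] = E_{w ~ p(w|D)}[f(x, w)]:
        # average the network output over the J posterior samples.
        return np.mean([f(x, w) for w in weight_samples], axis=0)

    # Toy example: a linear "network" f(x, w) = w . x with fake posterior samples
    rng = np.random.default_rng(0)
    samples = rng.normal(loc=[1.0, -0.5], scale=0.1, size=(1000, 2))  # stand-in for p(w|D)
    x = np.array([2.0, 1.0])
    print(predictive_mean(lambda x, w: w @ x, x, samples))  # close to 2.0 - 0.5 = 1.5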

Publication year: 2016